Summary

This  paper  reviews  the  process  of  selecting  officers  for 
U.S.  naval  aviation  training  and  describes  one  of  the 
principal  selection  tools,  the  Aviation  Selection  Test 
Battery  (ASTB).  The  1992  version  of  the  ASTB  is  a 
paper-and-pencil  test  administered  to  all  applicants  for 
naval  aviation  training.  ASTB  scores  and  ground  school 
and  flight  training  performance  data  were  available  for 
2852  student  naval  aviators  and  student  naval  flight 
officers,  and  these  data  were  used  to  re-assess  the  validity 
of  the  ASTB  in  predicting  student  performance.  The 
results  indicated  that  the  ASTB  remains  a valid  predictor 
of  ground  school  and  flight  training  grades,  and  to  a lesser 
extent,  attrition  from  training.  For  a small  subset  of  the 
sample  used  in  these  analyses,  data  from  a computer-based 
performance  test  (CBPT)  were  also  available.  The  CBPT 
required  subjects  to  engage  in  multi-axis  tracking  tasks 
concurrently  with  other  cognitive  tasks,  such  as  dichotic 
listening  and  working  memory  tasks.  Scores  from  the 
ASTB,  the  CBPT,  and  grades  from  ground  school  were 
entered  into  a linear  regression  upon  primary  flight 
training  grades.  The  results  showed  that  the  combination 
of  ground  school  and  CBPT  scores  can  be  used  as  a good 
predictor of performance (R² = .33, p < .0001). Although
these  results  will  require  cross  validation,  the  CBPT  shows 
promise  as  a new  selection  tool.  The  importance  of  these 
results  is  discussed  in  the  context  of  a recently  developed 
computer-based  version  of  the  ASTB. 

Introduction 

Earning  the  wings  of  a U.S.  naval  aviator  is  a goal  that 
many  seek.  Each  year,  approximately  10,000  individuals 
demonstrate  this  interest  by  taking  the  U.S.  Navy  and 
Marine  Corps  Aviation  Selection  Test  Battery  (ASTB). 
The  ASTB  is  one  of  the  initial  filters  in  selecting  students 
for  training  as  either  pilots  or  naval  flight  officers  (NFOs, 
who  perform  navigation  and  weapons  systems  duties  in  the 
cockpit).  This  paper  describes  the  ASTB  and  reviews  the 
aviator  selection  process,  and  then  presents  analyses  that 
were  conducted  on  data  from  existing  and  potentially  new 
methods  of  selecting  U.S.  Navy  pilots. 

The  ASTB 

The  ASTB  was  originally  introduced  in  1942,  and 
revisions followed in 1953, 1971, and 1992 (Frank &
Baisden, 1993). The current 1992 version was developed
and validated by Educational Testing Service of
Princeton,  New  Jersey.  It  is  a paper-and-pencil  test  that 
takes  approximately  2.5  hours  to  administer,  and  consists 
of  six  sub-tests.  The  six  sub-tests  are  the  math-verbal  test, 
the  mechanical  comprehension  test,  the  spatial 
apperception  test  (which  measures  spatial  reasoning 
abilities),  the  aviation  and  nautical  information  test,  the 
biographical inventory (which contains questions on
personal  history  and  interests),  and  the  aviation  interest 
test.  Weighted  combinations  of  the  sub-tests  are  used  to 
calculate  the  following  three  scores  used  in  the  pilot 
selection  process: 

1. Academic Qualification Rating (AQR) - validated
to  predict  academic  performance  in  ground  school. 

2.  Pilot  Flight  Aptitude  Rating  (PFAR)  - validated  to 
predict  flight  grades  in  primary  flight  training. 

3.  Pilot  Biographical  Inventory  (PBI)  - validated  to 
predict  attrition  through  primary  flight  training. 

The  Naval  Operational  Medicine  Institute  (NOMI) 
oversees  the  ASTB  testing  program,  including  test 
distribution,  official  scoring,  and  database  management. 

The  Selection  Process 

The  ASTB  plays  an  early  role  in  narrowing  down  the  very 
large  field  of  those  who  apply  for  naval  aviation  training. 
Data  provided  by  NOMI  show  that  approximately  half  of 
those  taking  the  ASTB  fail  to  meet  minimum  selection 
scores.  Those  who  score  favorably  must  then  undergo  a 
thorough  physical  examination  to  ensure  that  they  meet 
medical  standards.  Approximately  25%  do  not  pass  the 
physical  screening  process.  Those  who  remain  eligible  are 
interviewed  by  two  officers  who  complete  an  evaluation 
form  on  the  applicant,  and  the  applications  are  forwarded 
to  a three-member  evaluation  board.  This  board  usually 
consists  of  two  naval  aviators  and  a program  manager  who 
is  knowledgeable  of  current  and  projected  demands  for 
naval  aviators.  Approximately  half  of  the  applications  are 
recommended  for  selection  by  the  board.  Upon  final 
approval,  the  selected  applicants  are  offered  the 
opportunity  to  enter  naval  aviation  training.  Overall,  then, 
only  about  15%  of  those  who  take  the  ASTB  are  selected 
to  begin  training. 
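
The stage-by-stage figures above can be chained as a rough check. The sketch below multiplies the approximate pass rates quoted in the text; the stage labels and function name are illustrative and do not come from any Navy system. The product comes to roughly 19%, the same order as the overall figure of about 15%. Each stage rate is itself an approximation, and the final approval step is not quantified in the text.

```python
# Approximate stage pass rates quoted in the text (all rounded figures).
STAGES = [
    ("meets minimum ASTB scores", 0.50),
    ("passes physical examination", 0.75),
    ("recommended by evaluation board", 0.50),
]


def cumulative_pass_rates(stages):
    """Return (stage name, fraction of original test takers remaining) pairs."""
    remaining = 1.0
    results = []
    for name, rate in stages:
        remaining *= rate
        results.append((name, remaining))
    return results


for name, fraction in cumulative_pass_rates(STAGES):
    print(f"{name}: {fraction:.1%} of test takers remain")
```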


Paper  presented  at  the  RTO  HFM  Workshop  on  “Officer  Selection”, 
held in Monterey, USA, 9-11 November 1999, and published in RTO MP-55.
Applicants who were selected from the U.S. Naval
Academy  (USNA)  or  from  Naval  Reserve  Officer  Training 
Corps  (NROTC)  programs  begin  6 weeks  of  ground  school 
training  at  Aviation  Pre-flight  Indoctrination  (API).  API  is 
located  at  Naval  Air  Station  Pensacola,  and  students  must 
master  topics  such  as  aerodynamics,  fundamentals  of 
turbine  engines,  air  navigation,  flight  rules  and  regulations, 
aviation  physiology,  and  water  survival.  Applicants  who 
did  not  graduate  from  either  the  USNA  or  an  NROTC 
program  must  first  complete  13  weeks  of  Officer 
Candidate  School  before  beginning  API.  After  completing 
API,  pilots  and  NFOs  proceed  to  separate  primary  flight 
training  programs. 

Given  the  important  role  that  the  ASTB  plays  in  the 
selection  process,  it  is  important  to  assess  its  validity 
continually.  Frank  and  Baisden  (1993)  and  Hiatt, 
Mayberry,  and  Sims  (1997)  have  examined  the  predictive 
validities  of  ASTB  scores,  and  their  findings  are 
summarized  in  Table  1.  The  r values  represent 
correlations  uncorrected  for  restriction  of  range.  Note  the 
negative  association  between  PBI  scores  and  attrition 
status,  indicating  that  those  with  higher  PBI  scores  are  less 
likely  to  fail  out  of  primary  flight  training. 

Table 1

Previously reported correlations between ASTB scores and
criterion variables

Predictor : criterion           Frank & Baisden     Hiatt et al.
                                (1993)              (1997)

AQR : academic                  r = [illegible]     r = .42
  performance in API            p not reported      p < .05

PFAR : primary flight           r = .27             r = .40
  training grades               p not reported      p < .05

PBI : attrition from            r = -.25            r = -.12
  primary flight training       p not reported      p < .05

Damos  (1996)  reviewed  correlations  between  flight 
training  performance  and  a wide  variety  of  other  aviation 
selection  tests  and  found  results  comparable  to  those  for 
the  ASTB.  Although  the  correlations  are  statistically 
significant,  the  best  that  can  be  said  for  most  of  them, 
including  those  for  the  ASTB,  is  that  they  are  only 
moderately  strong.  Selecting  pilot  candidates  and 
predicting  their  flight  training  performance  is 
unquestionably  a very  difficult  and  complex  endeavor,  yet 
it  seems  that  we  should  be  able  to  do  better. 

For  several  decades,  scientists  at  the  Naval  Aerospace 
Medical  Research  Laboratory  (NAMRL)  have  been 
developing  aviator  selection  tests  that  could  be  used  in 
conjunction  with  the  ASTB.  These  efforts  have  been 
reviewed  by  Blower  and  Dolgin  (1991).  Many  of  these 
tests  are  computer-based  and  measure  a participant’s 
cognitive  and  psychomotor  skills  in  both  single-  and  dual- 
task/divided attention  contexts.  The  fact  that  these  tests 
include  psychomotor  tasks  that  must  be  performed  in  a 
divided attention setting brings them a step closer to
representing what is demanded of the pilot in the cockpit,
as  compared  to  the  paper-and-pencil  ASTB.  With  this  in 
mind,  we  set  out  to  reexamine  the  validity  of  the  aging 
ASTB  and  to  identify  any  incremental  validity  that 
computer-based  tests  could  add  to  the  current  methods 
used  to  select  applicants  into  aviation  training. 

Method 

As  part  of  an  ongoing  project,  NAMRL  has  obtained  a 
large  set  of  ASTB  and  flight  training  scores.  ASTB  scores 
were  provided  by  NOMI,  API  scores  by  Naval  Aviation 
Schools  Command  (NASC),  and  flight  training  grades  by 
Training  Wing  Five  and  the  Chief  of  Naval  Air  Training 
(CNATRA). 

The  first  goal  of  analyzing  the  data  was  to  determine  the 
degree  of  association  between  ASTB  AQR  scores  and  API 
grades.  AQR  scores  and  API  grades  were  available  for 
2852  individuals.  This  group  included  students  in  both  the 
pilot  and  NFO  programs.  Since  the  ground  school 
curriculum  at  API  is  identical  for  pilots  and  NFOs,  we 
decided  to  include  both  groups  in  the  analysis.  The  group 
consisted  of  2687  males  and  165  females,  and  they  were 
enrolled  in  API  between  November  1993  and  October 
1998.  The  Pearson  correlation  coefficient  between  AQR 
and  API  scores  was  calculated  for  this  group. 
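
For readers who want to reproduce this kind of analysis on their own data, the Pearson coefficient can be sketched as follows. This is a minimal illustration in plain Python, not the software used in the study; the function name and the five pairs of scores are made up for the example.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Made-up scores for five hypothetical students (not data from the study).
aqr_scores = [170, 185, 190, 205, 220]
api_grades = [45, 50, 49, 55, 58]
r = pearson_r(aqr_scores, api_grades)
print(f"r = {r:.2f}")
```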

The  second  goal  was  to  find  the  strength  of  association 
between  the  PFAR  and  primary  flight  training  grades  for 
the  student  pilots  in  the  sample  described  above.  There 
were  1660  individuals  for  whom  both  PFAR  and  primary 
flight  grades  were  available.  Of  this  group,  1573  were 
male  and  87  were  female.  These  students  were  enrolled  in 
primary  flight  training  between  November  1993  and  July 
1998.  The  Pearson  correlation  coefficient  between  PFAR 
and  primary  flight  training  grades  was  calculated  for  this 
group. 

The  third  goal  was  to  determine  the  strength  of  association 
between  PBI  scores  and  attrition  status  for  students  in  the 
sample.  The  PBI  was  originally  validated  to  predict 
attrition  due  to  flight  failure,  drop  on  request  (voluntarily 
withdrawing  oneself  from  training),  or  academic  failures. 
Therefore,  cases  of  attrition  due  to  medical,  family 
hardship,  or  unidentified  reasons  were  removed  from  the 
sample.  For  the  remaining  cases,  an  attrition  variable  was 
created  and  coded  as  0 for  those  who  successfully 
completed  primary  training  or  1 for  those  who  failed  to 
complete  due  to  attrition  from  either  API  or  primary  flight 
training.  In  a total  of  1849  cases  available  for  this 
analysis,  1744  were  male  and  105  were  female,  and  they 
were  enrolled  in  API  between  September  1993  and 
October  1998.  Again,  the  correlation  coefficient  between 
PBI  scores  and  attrition  status  was  calculated  for  this 
group. 
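
Because attrition status is a 0/1 variable, the correlation described above is a point-biserial coefficient, which is numerically the same as a Pearson coefficient computed on the coded values. The sketch below illustrates the coding and exclusion rules under stated assumptions: the outcome labels, function names, and example records are hypothetical and are not taken from the study's database.

```python
import math

# Attrition reasons the PBI was validated against, per the text above.
COUNTED_REASONS = {"flight failure", "drop on request", "academic failure"}


def code_attrition(outcome):
    """Return 0 for completion, 1 for a counted attrition, None to exclude."""
    if outcome == "completed":
        return 0
    if outcome in COUNTED_REASONS:
        return 1
    return None  # medical, family hardship, unidentified: removed from sample


def point_biserial(scores, flags):
    """Correlation between a continuous score and a 0/1 flag."""
    n = len(scores)
    group1 = [s for s, f in zip(scores, flags) if f == 1]
    group0 = [s for s, f in zip(scores, flags) if f == 0]
    if not group1 or not group0:
        raise ValueError("both outcome groups must be present")
    mean_all = sum(scores) / n
    sd_all = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    p = len(group1) / n
    mean1 = sum(group1) / len(group1)
    mean0 = sum(group0) / len(group0)
    return (mean1 - mean0) / sd_all * math.sqrt(p * (1 - p))


# Hypothetical (PBI score, outcome) records, for illustration only.
records = [(60, "completed"), (55, "completed"), (40, "flight failure"),
           (50, "medical"), (45, "drop on request"), (65, "completed")]
coded = [(s, code_attrition(o)) for s, o in records]
kept = [(s, f) for s, f in coded if f is not None]
r_pb = point_biserial([s for s, _ in kept], [f for _, f in kept])
print(f"r = {r_pb:.2f}")
```

A negative value, as here, means that higher scores go with the 0 (completed) code, which is the direction reported for the PBI.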

In  addition  to  the  selection  and  training  data  described 
above, NAMRL researchers collected psychomotor task
data  on  210  student  pilots  who  were  waiting  to  begin  API. 
All subjects participated on a voluntary basis. The 200
males  and  10  females  were  enrolled  in  API  between 
October  1995  and  February  1999.  Data  were  collected 
using  the  Computer  Based  Performance  Test  (CBPT) 
battery  (Blower  & Dolgin,  1991),  which  includes  a series 
of  tracking  and  information  processing  tasks  presented  in 
single-  and  dual-task  contexts.  The  CBPT  battery  runs  on 
personal-computer-type  processors,  and  for  this  study 
IBM-compatible  486  processor-based  machines  were  used. 
Each  of  the  CBPT  test  stations  includes  two  commercially 
available  joysticks,  a set  of  rudder  pedals,  a set  of  stereo 
headphones,  and  a numeric  keypad  that  the  subject  uses  for 
keyboard  inputs.  The  tracking  tasks  are  presented  on  a 
standard  VGA  monitor. 

The  first  task  in  the  CBPT  is  a two-dimensional  (2-D) 
compensatory  tracking  task  in  which  the  subject  uses  a 
joystick  to  keep  a cursor  centered  over  a set  of  crosshairs 
that intersect in the middle of the computer screen. The
cursor  is  continuously  driven  by  horizontal  and  vertical 
disturbance  functions  that  work  to  displace  the  cursor  from 
the  center.  The  computer  records  combined  horizontal  and 
vertical  error  as  cursor  pixel  distance  from  the  center  of  the 
crosshairs.  The  difficulty  of  this  task  is  increased  by  the 
fact  that  the  cursor  is  reverse-controlled  in  the  horizontal 
axis.  That  is,  moving  the  joystick  to  the  left  moves  the 
cursor  to  the  right,  and  vice  versa.  In  the  vertical  axis, 
control  is  more  stereotypical.  Moving  the  joystick  forward 
moves  the  cursor  downward;  moving  the  joystick  aft 
moves  the  cursor  upward.  Subjects  are  instructed  to  use 
their  right  hand  to  control  the  joystick.  The  2-min  2-D 
tracking  task  is  preceded  by  a 2-min  practice  session. 
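
The control mapping and error measure just described can be sketched in a few lines. This is an illustration only, not the CBPT source code. It assumes a unit control gain, a coordinate system in which positive x is right and positive y is up, Euclidean distance as the combined error, and a trial score formed by summing the per-sample errors; none of these details is specified in the text.

```python
import math


def step_cursor(pos, stick, disturbance, gain=1.0):
    """Advance the cursor by one time step.

    pos, stick, and disturbance are (x, y) pairs. Stick x < 0 means left,
    stick y > 0 means forward.
    """
    x, y = pos
    stick_x, stick_y = stick
    dist_x, dist_y = disturbance
    new_x = x - gain * stick_x + dist_x  # reversed: stick left moves cursor right
    new_y = y - gain * stick_y + dist_y  # forward stick moves the cursor down
    return (new_x, new_y)


def tracking_error(pos, center=(0.0, 0.0)):
    """Combined horizontal and vertical error as distance from the crosshairs."""
    return math.hypot(pos[0] - center[0], pos[1] - center[1])


def trial_score(positions):
    """Assumed trial score: sum of per-sample errors (lower is better)."""
    return sum(tracking_error(p) for p in positions)
```

For example, `step_cursor((0, 0), (-1, 0), (0, 0))` moves the cursor one unit to the right, reflecting the reversed horizontal axis.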

The  second  part  of  the  CBPT  is  a dichotic  listening  task 
(DLT)  that  requires  the  subject  to  selectively  attend  to 
information  presented  to  either  the  left  or  right  ear.  Two 
different  streams  of  letters  and  single  digit  numbers  are 
simultaneously  presented  to  each  ear  over  the  headphones. 
The  subject  must  pick  out  each  number  presented  to  the 
target  ear  and  enter  the  number  via  the  numeric  keypad. 
The  computer  assigns  the  target  ear  before  each  trial. 
There  are  12  trials,  each  presenting  9 numbers  and  13 
letters  to  each  ear.  Subjects  receive  four  practice  trials 
before  beginning  the  DLT,  which  takes  5 min  to 
administer.  The  number  of  correct  responses  is  recorded 
automatically. 
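
As a small illustration of the scoring, the sketch below counts correct responses for one trial and derives the maximum possible DLT score from the trial structure given above. The matching rule (responses compared with target-ear digits position by position) is an assumption; the text says only that the number of correct responses is recorded.

```python
TRIALS = 12          # trials in the DLT, per the text
DIGITS_PER_EAR = 9   # numbers presented to each ear per trial, per the text


def score_trial(target_digits, responses):
    """Count responses matching the target-ear digits, position by position."""
    return sum(1 for t, r in zip(target_digits, responses) if t == r)


# Maximum possible score if every target digit in every trial is entered.
MAX_SCORE = TRIALS * DIGITS_PER_EAR

print(score_trial([4, 7, 1, 9], [4, 2, 1, 9]), "of 4 correct; maximum", MAX_SCORE)
```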

The  third  part  of  the  CBPT  requires  the  subject  to 
simultaneously  perform  both  the  2-D  tracking  task  and  the 
DLT.  The  computer  presents  5 min  of  the  2-D  tracking 
task,  during  which  the  subject  engages  in  the  DLT. 

The  fourth  task  in  the  CBPT  adds  an  additional  cursor  that 
moves  only  in  the  horizontal  axis  at  the  bottom  of  the 
computer  screen.  The  subject  must  keep  this  cursor 
centered  using  the  rudder  pedals,  while  still  keeping  the 
original  2-D  tracking  task  cursor  centered  with  the 
joystick.  Rudder  cursor  input  control  is  conventional:  left 
rudder input moves the cursor left, while right rudder input
moves the cursor right. This fourth task is 2 min in
duration  and  is  preceded  by  a 2-min  practice  session. 

The  fifth  CBPT  task  adds  the  DLT  to  the  fourth  task. 
Subjects  engage  in  the  2-D  tracking  task,  the  rudder 
tracking  task,  and  the  DLT  simultaneously  for  a 2-min 
practice  session  and  then  begin  the  2-min  test  session. 

The  sixth  CBPT  task  adds  yet  another  cursor  that  moves 
only  in  the  vertical  axis  along  the  left  side  of  the  computer 
screen.  The  subject  must  keep  this  cursor  centered  in  the 
vertical  axis  with  a second  joystick  mounted  on  the  left 
side  of  the  test  station.  Subjects  are  instructed  to  use  their 
left  hand  to  manipulate  this  second  joystick.  Cursor 
control  is  again  conventional:  forward  stick  input  moves 
the  cursor  downwards  while  aft  stick  input  moves  the 
cursor upwards. In this sixth task, the subject must also
keep the original 2-D cursor centered with the right-hand
joystick and the rudder cursor centered with the rudder
pedals. There is no DLT associated
with  this  three-cursor  task,  and  2 min  of  practice  precede  2 
min  of  testing. 

The  seventh  CBPT  task  is  also  a tracking  task,  but  it  is  not 
associated  with  or  added  to  any  of  the  tasks  described 
above.  It  is  a one-dimensional  (horizontal)  tracking  task 
that  requires  the  subject  to  keep  a cursor  centered  on  a 
target  within  a horizontal  rectangle.  Control  mapping  of 
the  cursor  is  standard  in  that  left  joystick  movement  moves 
the  cursor  to  the  left,  and  right  input  moves  it  right. 
Similar  to  all  of  the  other  previous  tracking  tasks,  the 
cursor  is  continuously  driven  by  a disturbance  function 
that  works  to  displace  the  cursor  off  center.  The  subject 
engages  in  six  2-min  trials,  with  a 30-s  rest  period  between 
trials. 

The  eighth  CBPT  task  is  a working  memory  task  in  which 
the  subject  must  calculate  the  absolute  difference  between 
single  digit  numbers  that  are  sequentially  presented  on  the 
computer  monitor.  In  all  cases,  the  correct  answer  ranges 
from  1 to  4,  and  the  subject  is  instructed  of  this  fact.  The 
subjects  input  their  responses  via  the  numeric  keypad 
using  their  left  hand,  and  the  computer  automatically 
records  the  number  of  correct  responses.  The  task  is  self- 
paced  in  that  each  response  causes  the  next  number  to 
appear  on  the  screen.  The  absolute  difference  (AD)  task  is 
presented  as  a single  2-min  test. 
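
The logic of the AD task can be sketched as follows. This is a simplified model under stated assumptions: digits are drawn from 0 to 9, each new digit is chosen so that its absolute difference from the previous digit is between 1 and 4, and the session ends after a fixed number of items rather than after 2 min. The function names and the `respond` callback are hypothetical.

```python
import random


def next_digit(previous, rng):
    """Pick a single digit whose absolute difference from `previous` is 1 to 4."""
    candidates = [d for d in range(10) if 1 <= abs(d - previous) <= 4]
    return rng.choice(candidates)


def run_ad_task(n_items, respond, seed=0):
    """Present n_items digits after a starting digit; return the number correct.

    `respond(previous, current)` stands in for the subject's keypad entry.
    The task is self-paced: the next digit appears only after a response.
    """
    rng = random.Random(seed)
    previous = rng.randrange(10)
    correct = 0
    for _ in range(n_items):
        current = next_digit(previous, rng)
        if respond(previous, current) == abs(current - previous):
            correct += 1
        previous = current  # the subject must hold this digit in memory
    return correct


# A simulated subject who always answers correctly.
print(run_ad_task(20, lambda prev, cur: abs(cur - prev)))
```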

The  ninth  task  is  a dual-task  combination  of  the  horizontal 
tracking  task  and  the  absolute  difference  task.  Subjects 
engage  in  three  2-min  trials  of  this  dual-task  test. 

The  tenth  and  final  test  of  the  CBPT  is  a mental  rotation 
task  called  the  Manikin  Test.  In  the  CBPT  version  of  the 
Manikin  Test,  simplified  drawings  of  a sailor  appear  on  the 
computer  monitor.  The  sailor  is  holding  a red  square  in 
one  hand  and  a green  circle  in  the  other.  The  object  in 
each hand alternates randomly and the sailor appears randomly
in  one  of  four  orientations:  upright  and  facing  the  subject, 
upright  with  his  back  towards  the  subject,  upside  down  and 
facing  the  subject,  or  upside  down  with  his  back  towards 
the subject. The subject's task is to quickly determine
which  of  the  sailor's  hands  (right  or  left)  is  holding  the  red 
square.  Subjects  indicate  their  response  by  pressing  one  of 
two  keys  on  the  numeric  keypad.  The  Manikin  Test  is 
self-paced,  with  each  response  triggering  the  next  stimulus. 
The  computer  automatically  records  the  number  of  correct 
responses.  This  test  is  composed  of  four  2-min  trials. 
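
The mental rotation required by the Manikin Test can be made explicit with a short sketch. It assumes that "upside down" means the figure is rotated 180 degrees in the picture plane; the function names are illustrative and this is not the CBPT scoring code. Under that assumption, facing the subject mirrors left and right once and inverting the figure mirrors them again, so the two conditions cancel when both apply.

```python
def screen_side_of_right_hand(facing_subject, upside_down):
    """Return the side of the screen on which the sailor's right hand appears."""
    # Each condition swaps left and right once; two swaps cancel out.
    mirrored = facing_subject != upside_down
    return "left" if mirrored else "right"


def correct_response(red_square_screen_side, facing_subject, upside_down):
    """Return which of the sailor's hands ('right' or 'left') holds the red square."""
    right_hand_side = screen_side_of_right_hand(facing_subject, upside_down)
    return "right" if red_square_screen_side == right_hand_side else "left"


# Upright sailor facing the subject, red square on the left of the screen.
print(correct_response("left", facing_subject=True, upside_down=False))
```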

For all of the tracking tasks listed above, subjects were
instructed  to  maximize  tracking  accuracy.  For  the  DLT, 
AD,  and  Manikin  tasks,  subjects  were  instructed  to 
respond  as  quickly  and  accurately  as  possible.  On  dual- 
task tests,  subjects  were  instructed  to  perform  as  well  as 
possible  on  each  task,  and  to  give  each  equal  priority. 

The  CBPT  provides  a source  of  at  least  10  variables  that 
might  be  of  use  in  predicting  primary  flight  training 
performance.  API  grades  and  the  3 ASTB  scores  increase 
this  number  to  a pool  of  14.  Our  fourth  goal  was  to  reduce 
this  to  a more  practical  number  and  then  conduct  an 
exploratory  analysis  to  identify  promising  predictors.  To 
narrow  down  the  large  number  of  potential  predictors, 
variables  were  selected  according  to  three  decision 
strategies: 

1.  If  there  were  a priori  reasons  to  believe  that  a variable 
would  make  a good  predictor,  it  was  selected  for 
analysis.  This  criterion  pointed  to  the  PFAR  score, 
which  has  been  shown  to  predict  primary  flight  grade, 
and  API  grades  because  the  API  curriculum  is 
designed  to  cull  out  students  who  are  likely  to  have 
trouble  in  primary  flight  training. 

2.  Variables  that  were  measures  of  dual-  or  multi-task 
performance  were  favored  because,  at  a basic  level, 
such  performance  is  what  is  required  of  the  pilot  in  the 
cockpit.  However,  more  complicated  psychomotor 
test  batteries  are  often  burdened  with  reliability, 
calibration,  and  quality  control  problems  that  have  led 
to a poor history of wide-scale implementation
(Griffin  & Koonce,  1996;  North  & Griffin,  1977). 
With  this  in  mind,  the  CBPT  variables  that  required 
the  rudder  pedals  or  more  than  one  joystick  were 
eliminated  and  the  following  variables  were  chosen: 

a)  2-D  tracking  task  scores  and  DLT  scores, 
where  these  tasks  were  performed  in  combination 
with  each  other. 

b)  Horizontal  tracking  task  scores  and  AD  task 
scores,  again  where  these  tasks  were  performed  in 
combination  with  each  other.  Because  there  were 
three  trials  in  this  set,  scores  were  averaged  across 
the  trials. 

3.  We  also  decided  to  include  the  Manikin  Test  variable, 
because  this  task  is  unique  in  that  it  requires  the  subject  to 
engage  in  mental  rotation,  rather  than  in  a tracking  task. 

These  procedures  reduced  the  pool  of  potential  predictors 
to  the  following  seven:  PFAR  score,  API  score,  2-D 
tracking error, DLT score, horizontal tracking error, AD
score, and Manikin Test score. The variables were
examined  for  extreme  outliers,  as  defined  by  values  more 
than  three  standard  deviations  above  or  below  the  mean. 
This  procedure  eliminated  2-D  tracking  data  for  four 
subjects,  horizontal  tracking  and  Manikin  data  for  three 
subjects,  and  DLT  data  for  two  subjects.  Also,  PFAR 
scores  were  not  available  for  nine  subjects. 
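
The outlier rule described above can be sketched as a simple filter. The example assumes that the sample standard deviation was used and that the screen was applied once per variable; the text does not state either detail, and the values shown are invented.

```python
import statistics


def drop_extreme_outliers(values, n_sd=3.0):
    """Return the values lying within n_sd standard deviations of the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (assumption)
    return [v for v in values if abs(v - mean) <= n_sd * sd]


# Invented tracking-error scores: 19 typical values and one extreme value.
errors = [28000 + 500 * i for i in range(19)] + [250000]
screened = drop_extreme_outliers(errors)
print(len(errors), "->", len(screened))
```

Note that with very small samples a single extreme value cannot exceed three standard deviations, so a rule of this kind is only meaningful for samples of roughly the size used here.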

The  remaining  data  were  then  analyzed  in  a stepwise  linear 
regression upon primary flight grade. The p-value to enter
was set at p < 0.05, and the value to remove was set at p >
0.10.

Results 

Correlation  Analyses 

Summary  statistics  for  all  variables  analyzed  in  the 
correlation  analysis  of  ASTB  scores,  API  grades,  and 
primary  flight  grades  are  presented  in  Table  2 below. 

Table 2

Summary Statistics of ASTB Scores, API Grades, and
Primary Flight Grades

Variable                Mean           SD      N
AQR                     [illegible]    23.4    2852
API Grade               [illegible]    6.9     2852
PFAR                    207.5          23.6    [illegible]
Primary Flight Grade    47.6           10.4    [illegible]
PBI                     [illegible]    8.6     1849

The  analysis  of  association  between  AQR  scores  and  API 
grades  showed  a significant  correlation  between  the  two 
variables  (r  = .47,  p < .0001,  two-tailed),  indicating  that 
API  grades  increase  with  increasing  AQR  scores.  The 
analysis  of  PFAR  scores  and  primary  flight  grades  also 
yielded significant results (r = .36, p < .0001, two-tailed),
indicating that primary flight grades increase with
increasing  PFAR  scores.  The  final  correlation  analysis 
was  between  PBI  scores  and  attrition  status.  Individuals 
who  failed  out  of  the  program  were  coded  with  a value  of 
1 for  this  variable,  while  those  who  successfully  completed 
were  coded  with  a 0.  This  analysis  also  revealed  a 
significant correlation between the two variables (r = -.10,
p < .0001, two-tailed). Those with higher PBI scores were
less  likely  to  fail  out  of  the  program. 

Regression  Analysis 

The  stepwise  multiple  regression  analysis  yielded  a two- 
variable  model  for  predicting  primary  flight  grade.  The 
results  are  summarized  in  Table  3.  The  variables  included 
in  the  model  were  API  grade  and  2-D  tracking  error,  and 
they  accounted  for  33%  of  the  variance  seen  in  primary 
flight grades (adjusted R² = .33, p < .0001).
Table 3

Summary of Regression for Variables Predicting Primary
Flight Grade

Step  Variable            R²    Adj. R²  ΔR²    B        SE B    β      p <
1     API Grade           .251  .247     .251   .689     .092    .443   .0001
2     2-D Tracking Score  .339  .332     .088   -.0002   .0001   -.303  .0001

Note. ΔR² for step 2, p < .0001

Although  API  grades  were  included  in  the  model  above, 
they  are  not  available  until  after  the  student  completes  the 
6-week  API  curriculum.  However,  AQR  scores  are 
available  early  in  the  application  process,  fairly  soon  after 
the  applicant  takes  the  ASTB.  Because  AQR  scores  were 
shown  to  be  good  predictors  of  API  grades,  we  decided  to 
run  a second  regression  analysis  similar  to  the  first,  but 
replacing  API  grades  with  AQR  scores.  This  analysis  also 
yielded  a two-variable  model  that  included  2-D  tracking 
error  and  PFAR  score,  rather  than  AQR  score.  This  model 
accounted for 17% of the variance in primary flight grade
(adjusted R² = .173, p < .0001), and is summarized in
Table  4.  Table  5 presents  summary  statistics  for  all 
variables  used  in  the  regression  analyses. 

Table 4

Summary of Second Regression for Variables Predicting
Primary Flight Grade, Replacing API Grade With AQR
Score

Step  Variable            R²    Adj. R²  ΔR²    B        SE B    β      p <
1     2-D Tracking Score  .150  .146     .150   -.0003   .0001   -.376  .0001
2     PFAR Score          .181  .173     .031   .0961    .0352   .1773  .0001

Note. ΔR² for step 2, p = .007
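
The R² columns in the two regression tables are related by simple arithmetic, which the following sketch makes explicit. The increment at each step is the difference between successive R² values, and adjusted R² follows the standard formula 1 - (1 - R²)(n - 1)/(n - k - 1) for k predictors and n cases. The number of complete cases entering the regressions is not reported, so N_ASSUMED below is a hypothetical value in the neighborhood of the roughly 200 subjects with CBPT data; it is used only to show how the formula behaves.

```python
def delta_r2(r2_by_step):
    """Increment in R^2 contributed at each step of a stepwise model."""
    increments = []
    previous = 0.0
    for r2 in r2_by_step:
        increments.append(round(r2 - previous, 3))
        previous = r2
    return increments


def adjusted_r2(r2, n_cases, n_predictors):
    """Adjusted R^2 for a model with n_predictors fitted to n_cases."""
    return 1.0 - (1.0 - r2) * (n_cases - 1) / (n_cases - n_predictors - 1)


FIRST_MODEL_R2 = [0.251, 0.339]   # API grade, then 2-D tracking
SECOND_MODEL_R2 = [0.150, 0.181]  # 2-D tracking, then PFAR
N_ASSUMED = 192                   # hypothetical case count, not from the paper

first_increments = delta_r2(FIRST_MODEL_R2)
second_increments = delta_r2(SECOND_MODEL_R2)
first_adjusted = [round(adjusted_r2(r2, N_ASSUMED, k), 3)
                  for k, r2 in enumerate(FIRST_MODEL_R2, start=1)]
print(first_increments, second_increments, first_adjusted)
```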


Table 5

Summary Statistics of Variables Used in the Regression
Analyses

Variable                     Mean       SD         N
AQR                          192.4      19.7       201
PFAR                         213.1      18.7       201
2-D Tracking Error           28033.9    13127.4    206
DLT                          98.6       8.9        208
Horizontal Tracking Error    29414.7    13646.4    207
AD                           59.3       13.7       210
Manikin                      83.1       19.5       207
API Grade                    52.1       6.9        210
Primary Flight Grade         51.0       10.1       210

Discussion 

The  purpose  of  the  efforts  described  in  this  paper  was  to 
reexamine  the  predictive  validities  of  the  ASTB  and  to 
explore  possibilities  for  new  tests  that  could  improve  the 
U.S.  naval  aviator  selection  process.  Although  the  current 
version  of  the  ASTB  was  introduced  7 years  ago,  the  r 
values  found  here  generally  indicate  that  it  is  still 
performing  well.  The  AQR  was  designed  to  predict  API 
grades,  and  the  correlation  analysis  of  these  two  variables 
shows  that  as  AQR  scores  increase  so  do  API  grades.  By 
squaring the correlation coefficient of r = .47, we see that
AQR  scores  can  account  for  some  22%  of  the  variance  in 
API  grades.  This  correlation  coefficient  of  r = .47  is 
somewhat  stronger  than,  yet  still  consistent  with,  results 
reported  elsewhere  (Frank  & Baisden,  1993;  Hiatt  et  al., 
1997).  It  also  compares  favorably  with  other  types  of 
aviation  selection  tests  (see  Damos,  1996). 
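
The same squaring step applies to each of the three correlations reported in the Results. The short sketch below simply carries out that arithmetic on the reported r values; the dictionary labels are ours.

```python
# Correlations reported in the Results section of this paper.
REPORTED_R = {
    "AQR vs. API grade": 0.47,
    "PFAR vs. primary flight grade": 0.36,
    "PBI vs. attrition status": -0.10,
}


def variance_explained(r):
    """Proportion of criterion variance accounted for by a predictor."""
    return r * r


shares = {label: round(variance_explained(r), 2)
          for label, r in REPORTED_R.items()}
for label, share in shares.items():
    print(f"{label}: {share:.0%}")
```

On these figures the AQR accounts for about 22% of the variance in API grades, the PFAR for about 13% of the variance in primary flight grades, and the PBI for about 1% of the variance in attrition status.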

It  should  be  emphasized  that  the  relationship  between 
AQR  scores  and  API  grades  was  observed  within  a sample 
of  rigorously  selected  candidates.  Therefore,  the  range  of 
values  for  both  the  predictor  and  outcome  variables  is 
certainly  restricted  as  compared  to  what  would  be  seen  if 
all  applicants  were  permitted  to  enter  training.  This 
condition  limits  the  potential  strength  of  association.  The 
same  range  restriction  is  operating  on  all  of  the  analyses 
reported  here. 
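
To give a sense of how much range restriction can attenuate a correlation, the sketch below applies the standard correction for direct restriction on the predictor (often called Thorndike's Case II). No such correction was applied in this paper, and the unrestricted standard deviation of applicant scores is not reported, so the ratio of 1.5 used in the example is purely hypothetical. Selection here also operates through several stages, so a single-variable correction of this kind is at best a rough guide.

```python
import math


def correct_for_range_restriction(r, sd_unrestricted, sd_restricted):
    """Estimate the unrestricted correlation from a range-restricted one.

    Assumes direct selection on the predictor only.
    """
    u = sd_unrestricted / sd_restricted
    return (r * u) / math.sqrt(1.0 - r * r + r * r * u * u)


# Observed AQR-API correlation with a hypothetical SD ratio of 1.5.
corrected = correct_for_range_restriction(0.47, sd_unrestricted=1.5,
                                          sd_restricted=1.0)
print(f"corrected r = {corrected:.2f}")
```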

The  relationship  between  PFAR  scores  and  primary  flight 
grades also remains fairly strong, with an observed r = .36.
This  value  falls  in  between  those  reported  by  Frank  and 
Baisden  (1993)  and  Hiatt  et  al.  (1997),  but  it  is  generally 
consistent  with  them  (see  Table  1).  The  fact  that  a simple, 
inexpensive,  paper-and-pencil  test  can  predict  cockpit 
performance  as  well  as  this  one  does  is  impressive,  and  we 
can  conclude  that  the  PFAR  continues  to  serve  its  purpose 
well. 

The  PBI  was  originally  validated  to  predict  attrition  up 
through  the  primary  flight  training  portion  of  flight 
instruction.  The  correlation  coefficient  between  PBI 
scores  and  attrition  status  in  our  analysis  was  r = -.10. 
Although  this  value  was  statistically  significant  and 
indicates  that  those  with  higher  PBI  scores  are  less  likely 
to  fail  in  training,  this  association  is  not  very  strong.  It 
was  comparable  to  that  reported  by  Hiatt  et  al.  (1997)  but 
weaker  than  that  reported  by  Frank  and  Baisden  (1993). 
Given that the predictive power of the AQR and PFAR
has held up over the years, this result is somewhat
puzzling,  but  we  offer  a possible  explanation. 

The ASTB was validated by Educational Testing Service
on  a sample  of  individuals  who  had  taken  the  test  once, 
and  the  r values  reported  by  Frank  and  Baisden  (1993) 
reflect  this  validation.  The  current  ASTB  testing  policy 
states  that  an  individual  may  take  the  test  for  a second  time 
30  days  after  the  first  testing.  After  the  second  testing, 
retesting  is  allowed  at  180-day  intervals,  and  there  is  no 
limit to the number of retests. In all cases, the most recent
scores  replace  any  previous  scores.  Individuals  who  take 
the  ASTB  for  the  first  time  and  receive  a low  score  on  the 
PBI  may  be  inclined  to  change  their  PBI  answers  upon 
retesting,  in  order  to  improve  the  score.  The  nature  of  the 
PBI  lends  itself  to  this  sort  of  behavior.  By  contrast,  a 
person  must  increase  his/her  knowledge  of  the  subject 
matter  covered  on  the  portions  of  the  test  used  to  compute 
the  AQR  and  PFAR  scores  (i.e.,  the  math-verbal, 
mechanical  comprehension,  spatial  apperception,  and 
aviation  and  nautical  information  tests).  This  is  a much 
more  difficult  proposition,  and  may  account  for  the 
consistent  validities  of  these  scores.  By  comparing  the 
predictive  validities  of  one-test-only  PBI  scores  to  re-test 
PBI  scores,  the  accuracy  of  this  explanation  could  be 
determined.  It  may  well  be  the  case  that  one-test-only  PBI 
scores  have  retained  their  original  validity,  and  this  seems 
to  be  an  appropriate  issue  for  future  analysis. 

The  regressions  conducted  in  the  exploratory  analysis  of 
the  CBPT  scores  were  performed  to  identify  the  best 
variables  for  predicting  primary  flight  training 
performance,  and  stepwise  procedures  were  chosen  for  this 
purpose.  We  are  aware  that  stepwise  regression  is 
sometimes  criticized  for  its  increased  exposure  to  the 
possibility  of  capitalizing  upon  chance.  However,  as 
Hayes (1988) has pointed out, in exploratory analyses such
procedures  are  appropriate  provided  that  selected  variables 
are  subject  to  subsequent  independent  validation. 
Accordingly,  we  would  indeed  cross-validate  any  new 
selection  test  before  recommending  it  for  implementation. 

The  results  of  the  regression  analyses  were  straightforward 
and  promising  at  least  from  a research  standpoint.  In  the 
first  regression,  the  model  included  API  scores  and  2-D 
tracking  scores,  accounting  for  33%  of  the  variance  in  the 
primary  flight  grades.  Prior  to  conducting  this  analysis, 
the  best  predictor  of  primary  flight  grade  was  the  PFAR 
score,  explaining  13%  of  the  variance  in  the  correlation 
analysis  sample.  Thus  a model  that  can  include  API 
grades  and  2-D  tracking  scores  represents  a substantial 
improvement. 

From  a practical  selection  standpoint,  however,  the  utility 
of  this  first  model  is  limited.  In  order  to  obtain  API 
grades,  an  applicant  would  first  have  to  complete  6 weeks 
of  ground  school.  Nevertheless,  there  are  other  useful 
applications  for  such  a model.  One  example  would  be  to 
use  it  for  aviation  progress  review  boards.  These  boards 
evaluate  students  who  are  having  difficulty  in  flight 
training,  and  board  members  must  make  a 
recommendation  on  whether  the  student  should  be  retained 
or  separated  from  training.  Any  tool  that  can  provide  an 
objective  assessment  of  the  student  would  be  extremely 
useful  to  the  board.  If  this  model  can  be  validated,  it 
would  certainly  fill  this  role. 

The  second  regression  was  performed  to  analyze  variables 
that  could  be  made  available  before  an  individual  entered 
training. AQR and PFAR scores meet this requirement. If
a version of the CBPT could be implemented, these types
of  scores  would  be  available  as  well.  The  second 
regression  showed  that  a model  incorporating  2-D  tracking 
and  PFAR  scores  accounted  for  17%  of  the  primary  flight 
grade  variance  in  our  sample.  While  not  as  strong  as  a 
model  that  can  include  API  grades,  it  is  an  improvement 
over  the  current  practice  of  using  PFAR  scores  alone. 

The  final  issue  to  be  addressed  is  that  the  results  of  these 
analyses  come  at  an  opportune  time.  NAMRL  has  recently 
introduced  the  Automated  Pilot  Examination  (APEX) 
system,  which  is  a computer-based  and  networked  version 
of  the  ASTB.  APEX  has  been  successfully  operating  at 
several  recruiting  sites  for  the  past  year,  and  it  has 
performed  well.  Because  APEX  is  computer-based,  it 
should be possible to incorporate portions of the CBPT into
APEX.  The  analyses  reported  here  indicate  that  the  2-D 
tracking  task/DLT  combination  is  a good  candidate  for 
inclusion.  This  would  greatly  facilitate  validation  efforts, 
because  any  applicant  who  was  tested  on  the  APEX  system 
would  also  be  providing  2-D  tracking  task  data  (even 
though  those  data  would  not  be  used  for  selection  purposes 
during  this  validation  phase).  Some  of  these  applicants 
would  eventually  enter  aviation  training,  and  when  they 
did, their CBPT data would already be available for
analysis  and  validation.  In  this  manner,  NAMRL  could 
continue  to  make  significant  contributions  in  improving 
the  process  of  selecting  U.S.  naval  aviators. 